1.0 Introduction


1.1 Background — Scaling and the Limits of Scaling

The very existence of Single Event Effects (SEE) is a consequence of scaling. Longtime
NSREC attendees can remember when there were no SEE, because device sizes had
not been scaled down enough for a single particle to have any detectable effect. By scaling, 
of course, we mean the consistent reduction in the size of electronic devices, which has been 
a hallmark of the semiconductor industry. Scaling is normally said to be governed by 
Moore’s Law. 1,2 Originally proposed in 1965, Moore’s Law stated that the number of 
transistors on a chip would double every year for the next ten years. The data available to 
Moore in 1965 and his original prediction are shown in Fig. 1. 1 Moore also discussed the cost
of an integrated circuit as a function of complexity, shown in Fig. 2. The point was that
adding components to a chip reduced the cost per component up to some point in any given
year. Beyond that point, the cost per component started to rise again because of poor yields.
In any given year, the cost of the circuit as a function of the number of components was a
U-shaped curve, where the cost dropped from year to year, and the minimum cost fell at higher
component counts each year. In 1965, the lowest cost per transistor was for circuits with
about 50 transistors. He predicted that the minimum cost circuits would be at about 65,000
transistors by 1975, and that the chips would occupy about one quarter of a square inch.

Figure 1. Moore’s original prediction: the number of transistors per chip
doubling every year for ten years, 1965 to 1975. 1

Figure 2. Moore’s analysis of the cost per transistor by year, from 1965. He
predicted that by 1975, the lowest cost per transistor would be achieved for
chips at about the 64K level. 1

In 1975, Moore published another paper discussing both the original prediction, and 
what had actually happened. 2 The main result is shown in Fig. 3, where the data fits the 
prediction remarkably well. He also analyzed three factors that contributed to the
increasing integration of complex circuits, shown in Fig. 4. From 1959 to 1975, the number
of transistors on the largest chips had increased from one to about 64K. First, during the
same period, the area of the largest chips had increased by a factor of about 20. Second, the
square of the minimum feature size had decreased by about a factor of 32 in that period.
(The earliest integrated circuits, in 1961, had line-widths of about 25 µm, which was
reduced below about 5 µm by 1975.) The third factor was what Moore called “device and
circuit cleverness,” which accounted for the remaining factor of 100, making it the largest of
the three factors. By device and circuit cleverness, Moore meant that less space between 
transistors was used for isolation structures and metal interconnects each year, allowing 
more transistors to be added in any given area. For example, running metal interconnects 
over the top of active devices, rather than between them, allowed a larger fraction of the 
total area to be devoted to active devices. Moore predicted, in the 1975 paper, that the 
progress due to device and circuit cleverness would not continue at the same rate in the 
future, although he expected increasing chip area and reduced device feature size to continue 
to follow the same trends. Therefore, the doubling period should increase, perhaps (he said) 
to two years, instead of one year. Actual experience since then has been that the number of 
transistors on a chip really doubles about every eighteen months. 
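
Moore's three factors can be checked with a few lines of arithmetic. The sketch below uses only the approximate figures quoted above (the variable names are illustrative, not Moore's) to multiply the factors together, convert the total into an equivalent number of doublings, and back out the linear feature size implied by the factor of 32.

```python
import math

# Approximate factors quoted from Moore's 1975 analysis (1959 to 1975).
die_area_factor = 20        # growth in area of the largest chips
feature_area_factor = 32    # reduction in (minimum feature size) squared
cleverness_factor = 100     # "device and circuit cleverness"

total = die_area_factor * feature_area_factor * cleverness_factor
doublings = math.log2(total)
years = 1975 - 1959

# Linear feature size implied by a 32x area reduction from 25 um line-widths.
implied_feature_um = 25.0 / math.sqrt(feature_area_factor)

print(f"combined factor: {total}")
print(f"equivalent doublings: {doublings:.1f} over {years} years")
print(f"years per doubling: {years / doublings:.2f}")
print(f"implied feature size: {implied_feature_um:.1f} um")
```

The combined factor comes out at about 64K, and sixteen years for roughly sixteen doublings is the one-year doubling period of the original prediction.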


2003 IEEE Nuclear and Space Radiation Effects Conference Short Course, Monterey, California - July 21, 2003





Figure 3. Comparison of actual data with Moore’s original prediction. 2 



Figure 4. Moore’s analysis of the factors contributing to the increased
complexity of the most advanced chips: increased die size, reduced feature
size, and device and circuit cleverness. The device and circuit cleverness
factor was the largest of the three. 2

Scaling of MOS transistors has been reduced almost to following a recipe. Dennard
et al., 3 in 1974, introduced a dimensionless scaling constant, which they called k. Each
linear dimension was reduced by a factor of k, and the doping levels had to be increased by
a factor of k, to shrink the device area by k². Then, to maintain constant fields, the applied
voltage also had to be reduced by a factor of k. In the example they discussed, they
described how to shrink a 5 µm device to 1 µm (k=5), which is illustrated in Fig. 5. This is
the origin of the term scaling: each dimension is scaled by the same factor. Since the industry
was generally not willing to reduce power supply voltages that quickly, constant fields were
not really maintained, but modified scaling based on Dennard’s recipe has been followed for
many years. Dennard and others have published several other papers in later years, extending
their original work, most recently discussing devices down to 25 nm. 3-7 Although bipolar
devices have also been greatly reduced in size during this period, there is not a simple recipe
for scaling bipolar devices.
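
The constant-field recipe is simple enough to write down directly. The following is a minimal sketch, not Dennard's own formulation: the `MosDevice` class, its field names, and the starting doping and voltage are hypothetical placeholders chosen for illustration, and only the 5 µm to 1 µm shrink with k=5 comes from the text.

```python
from dataclasses import dataclass

@dataclass
class MosDevice:
    length_um: float     # representative linear dimension
    doping_cm3: float    # substrate doping concentration
    voltage_v: float     # applied voltage

def scale_constant_field(dev: MosDevice, k: float) -> MosDevice:
    """Constant-field recipe: dimensions / k, doping * k, voltage / k."""
    return MosDevice(dev.length_um / k, dev.doping_cm3 * k, dev.voltage_v / k)

# Doping and voltage below are assumed values, not taken from the text.
old = MosDevice(length_um=5.0, doping_cm3=5e15, voltage_v=20.0)
new = scale_constant_field(old, k=5.0)

area_ratio = (new.length_um / old.length_um) ** 2      # shrinks as 1/k^2
field_ratio = (new.voltage_v / new.length_um) / (old.voltage_v / old.length_um)

print(new)
print(f"area ratio: {area_ratio:.3f}, field ratio: {field_ratio:.2f}")
```

Because voltage and length shrink by the same factor, their ratio, a rough proxy for the electric field, is unchanged. Holding the voltage fixed instead, as the industry often did, would raise that ratio by k.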




Figure 5. Scaling recipe for CMOS, after Dennard. 3 Each dimension is 
reduced by the same scale factor, k, in each generation of technology. 

Scaling has been institutionalized to the point that the Semiconductor Industry
Association (SIA) maintains the International Technology Roadmap for Semiconductors
(ITRS), which is basically a guide to scaling. 8 The roadmap projects technology
development for the next fifteen years, including integration levels, feature size, speed,
power, and many other things. Normally, 1/k is about 0.7 in scaling from any given
technology generation to the next. The ITRS also discusses in some detail the technical
barriers that must be overcome in order to continue following Moore’s Law. Technical
barriers for which there is no known solution are coded in red, and the amount of red in the
charts has tended to increase each year. An example of a “red brick wall chart” is shown in
Fig. 6. This chart deals with particulate control, and says there is no known method for
controlling particulates that will be adequate for the requirements of 2006. Even so, the
industry expects scaling as we have known it to continue for several more years; solutions
for these problems are expected to be found when they are needed. More important, the
industry expects two of the main consequences of scaling to continue as well. First, the SIA
assumes the cost per function will continue to decline by its historic average, 25% per year.
Second, as a result, the market for integrated circuits will continue to grow by its historic
average, 17% per year. The ITRS is revised every year, with major revisions in
odd-numbered years.

Figure 6. Red Brick Wall Chart, from the ITRS, 8 all red starting in 2006.


The importance of scaling in the semiconductor industry has spawned a significant 
subsidiary industry — the writing of papers predicting the end of scaling, or discussing the 
limits of scaling. As a result, the literature is littered with erroneous predictions that feature 
sizes would not be reduced beyond some point because of some perceived technical barrier. 
A summary of early work of this sort was presented by Folberth and Bleher, 9 and is
illustrated in Fig. 7. One can see that the early predictions of the minimum device size
differ by four orders of magnitude, depending on the author. The first prediction, by
Swanson 10 and Landauer, 11 is not specific to digital semiconductor circuits, but rather refers
to any type of storage device. These authors argue that any storage element containing fewer
than about 100 atoms will be unstable because of random thermal agitation. Certainly, when
the energy to switch a memory element approaches kT, a fundamental limit will have been
reached. Devices containing only 100 atoms were so far beyond the actual state of the art at
the time that the work of Swanson and Landauer attracted relatively little attention. Indeed,
such devices are still beyond the current state of the art, and in fact, also beyond the end of
the current ITRS, which projects to 2016. Therefore, this prediction will not be contradicted by
experiment in the foreseeable future.


[Figure 7 labels. Authors: Wallmark and Marcus (1962); Hoeneisen and Mead (1972); Keyes (1972); Landauer and Swanson (1962). Limiting mechanisms shown: edge uncertainty, doping fluctuations, extension of space charge regions, breakdown and reachthrough effects, heat generation, metal migration, thermal noise.]

Figure 7. Summary of early predictions of the end of scaling. 9 Expert 
predictions differed by four orders of magnitude. 

The second prediction of the limits of scaling was by Wallmark and Marcus, in 1962,
in a paper familiar to many NSREC attendees. 12 They concluded “the minimum device size
under reasonable conditions is approximately (10 µm)³, which is not far from devices now
in the planning stage and within reach of existing techniques. It is within a factor of 2-5 of
the dimensions of the active region of many devices of today.” While this prediction seems
humorous today, the fact is that Wallmark and Marcus reached it because they anticipated
Single Event Effects (SEE) before they were observed. They argued that ionization from
sea level cosmic rays would render devices unreliable at smaller feature sizes. They also
analyzed the effects of naturally occurring background radiation (the so-called alpha particle
problem), the effects of heat generation, the effects of fluctuation in doping levels, and other
things. These results are summarized in Fig. 8. Because they anticipated these problems
before they were observed, one could argue that Wallmark and Marcus were prophets,
despite their incorrect prediction. The mistake they made was in assuming that once SEE
was observed, nothing could be done about it.




Figure 8. Wallmark and Marcus prediction that sea-level cosmic rays would
render devices unreliable below 10 µm feature size. 12

About ten years later, in 1972, Hoeneisen and Mead presented two papers on
fundamental limits in microelectronics. 13, 14 They argued that the limit for dynamic MOS
transistors would be at about 10⁷ transistors per chip, which they predicted would be reached
around 1980, as in Fig. 9. They analyzed many potentially limiting reliability problems
(power dissipation, metal electro-migration, substrate doping fluctuations, substrate
breakdown, punch-through, and gate oxide breakdown). They concluded that gate oxide
breakdown was the most severe problem, the one that would ultimately limit scaling. In
their analysis, they considered 0.25 µm transistors with 7.2 nm gate oxides and a 1-V power
supply. Perhaps the problem was that 7.2 nm oxides, grown in 1972 by graduate students,
really weren’t very reliable.



Figure 9. Hoeneisen and Mead prediction of the limits of scaling, 13 due to gate
oxide reliability, at about 10⁷ transistors per chip, in 1980.

Keyes, in a series of papers, has considered limiting factors for microelectronics,
concluding that the ability to dissipate power will limit miniaturization. 15-17 Wisely, he did
not specify a limiting feature size.

A good, and much more recent, discussion of physical limits in microelectronics has 
been presented by Plummer and Griffin, 18 who discuss difficult problems that require 
solutions if the ITRS is to be met. But in the end, they also say that solutions will probably 
be found. For example, gate oxide thickness scaling of pure SiO₂ cannot continue beyond
1-1.5 nm, which is close to the present state-of-the-art, because of direct tunneling current,
which increases power dissipation unacceptably and degrades oxide reliability. Therefore,
research on alternative gate dielectrics is proceeding, and it has to succeed. Similarly, gate 
threshold voltage has to be reduced as power supplies are reduced, in order to maintain 
current drive, and switching speed. But the threshold voltage also has to be significantly 
above zero to minimize leakage current and power dissipation. For these reasons, the 
acceptable range of threshold voltages is being squeezed from both sides. The need for very 
shallow, highly doped junctions is another example. The concentration of dopants is 
approaching the solubility limits in Si, on the one hand. But on the other hand, shallow 
implants necessary to reach these concentrations also require high temperature annealing 
steps to repair implant damage. The high temperature annealing steps also allow dopants to 
diffuse away, reducing concentrations. However, Plummer and Griffin conclude that 
solutions for these and other problems will probably be found, “because of the enormous 
economic incentive to continue density and performance improvements.” In other words, 
when billions of dollars in profits depend on overcoming specific technical barriers, 
technologists have historically been highly motivated, and amazingly resourceful. 

The profitability of the semiconductor industry is what has driven the industry to stay 
with Moore’s Law. However, Moore, himself, noted in 1979, that (even then) most Intel 
customers did not require and did not buy products approaching the performance limits of 
Moore’s Law. 19 Although the ITRS assumes the industry will continue its historic growth 
pattern, there is some indication that the growth of the industry will slow down soon. The 
reason was pointed out by Myers. 20 The market for finished electronic products (TV sets, 
computers, cell phones, etc.) has historically grown at about seven percent CAGR
(compound annual growth rate), which is at least twice the growth of the economy as a
whole, but much less than the chip industry, which has had a CAGR of about 17% since at 
least 1960. The percentage of the finished product due to the chip content had increased 
from about two percent in 1969 to 14% in 1996. The point is that the value of the chips 
cannot exceed the value of the end item they are built into, so the growth curve for the chip 
industry will have to change slope, and follow the curve for the whole electronics industry at 
some point. Myers projected this change would happen between 2000 and 2010 — perhaps 
about now. If the semiconductor industry is less profitable in the future, it may be less 
willing to continue the investments necessary to stay on Moore’s Law. The challenge will 
be economic, rather than technical. 
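
Myers's argument can be illustrated with a naive extrapolation. The sketch below is not Myers's calculation. It simply assumes that the two quoted growth rates persist unchanged from the 1996 chip share, and asks when chip value would equal the entire value of the finished product. All function and variable names are illustrative.

```python
import math

chip_growth = 0.17       # chip industry CAGR quoted in the text
product_growth = 0.07    # finished-electronics CAGR quoted in the text
share_1996 = 0.14        # chip content of finished products in 1996

# If both trends persisted, the chip share would be multiplied by this each year.
ratio = (1 + chip_growth) / (1 + product_growth)

def share_in(year: int) -> float:
    """Unconstrained extrapolation of chip share of finished-product value."""
    return share_1996 * ratio ** (year - 1996)

# Years after 1996 at which the extrapolated share would reach 100 percent.
years_to_saturate = math.log(1 / share_1996) / math.log(ratio)

print(f"extrapolated share in 2010: {share_in(2010):.0%}")
print(f"share would reach 100% around {1996 + years_to_saturate:.0f}")
```

Since the share cannot approach 100 percent, the chip growth curve must bend toward the slower curve well before that point, which is the substance of the argument above.
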

1.2 Basic Definitions and Concepts


• LET — linear energy transfer, or energy loss per unit path-length. 

• Cross section — sensitive area, such as the area in which an ion strike causes an upset. 

• Critical charge — minimum charge to cause a specific circuit effect, such as a 
memory cell upset. 

• SEU — single event upset, also referred to as a soft error, a change of state of a 
memory cell, where stored information is lost, but the cell is not damaged, induced 
by a single ion. 

• SEL — single event latchup, regenerative high current state which occurs in four 
layer, pnpn, structures. Latchup is triggered in different ways, including single ions. 
Once triggered, the high current state is maintained until power to the circuit is 
turned off. Latchup is potentially destructive, because the high current can burn out
critical parts of the circuit. 

• SESB — single event snapback, regenerative high current mode related to parasitic 
bipolar action, similar to latchup, except that it occurs in three layer structures. 

• SEB — single event burnout, typically observed in power devices. Either a parasitic 
bipolar device (in a MOSFET) or a bipolar device is turned on by a single ion, 
resulting in a high current condition that burns out metal electrodes.

• SEGR — single event gate rupture, localized destruction of the gate oxide along an 
ion track, resulting in a hard short between gate and substrate. 

• SET — single event transient, a current or voltage noise pulse generated by a single
ion, which disrupts the operation of the circuit.

• SHE — single hard error, also known as a stuck bit, total dose-induced failure of a 
memory cell, caused by a single ion. 

2.0 Basic Mechanisms


In this section, we discuss the physical processes involved in charge deposition, 
recombination, and transport in electronic materials, both semiconductors and the 
accompanying dielectric layers. 


2.1 Charge deposition — track structure effects

The charge deposition in any material is proportional to the LET, which determines 
the energy lost by an incident particle. However, to determine the charge deposited, one 
also has to know the electron-hole pair creation energy, which is different in each material. 
Parameters for a number of materials of interest are summarized in Table 1, including charge
pairs per unit path-length per unit LET for an incident ion, where LET is given in units of
MeV/(mg/cm²).


Table 1. Summary of Materials of Interest.

Material    E_p (eV)    ρ (g/cm³)    Pairs/µm per unit LET
Si          3.6         2.33         6.47E4
GaAs        4.8         5.32         1.11E5
SiO₂        17          2.2          1.29E4

In the early days of SEE studies, it was common to compare results based (only) on 
the LET of the incident particle. However, it was not long before it was found that particles 
at different energy with the same LET produced different effects. 21 For this reason, other 
studies were performed to better understand the track structure. 22, 23 Calculated track 
structures for selected ions are shown in Fig. 10. 23 In both cases, there is a dense central 
core, and the density falls off rapidly with radial distance. In Fig. 10, both ions have the 
same LET, so the core structure is very similar. The only difference is that delta rays, 
knock-on electrons, have a greater range for the higher energy ion, so the maximum radius is 
greater. However, the vertical axis is a log scale, and only a very small fraction of the 
charge is deposited at relatively large radii by delta rays. The radial track distribution has 
also been measured experimentally, using a detector with a set of concentric rings. 22 Typical 
results are shown in Fig. 11. The detectors are about 0.5 µm wide, and separated by 0.5 µm.
Measured results are given by contact number, but the data points are about 1 µm apart.
Measurements are compared with calculated charge deposition, with reasonable agreement. 
Since the track diameter may be comparable to the device size, or sometimes larger, it seems 
self-evident that device simulations have to take into account the structure of the track. One 
point still the subject of some controversy is the use of a Gaussian approximation for the
radial charge distribution in an ion track. Gaussian approximations have been used by some
authors, 24 and criticized as inadequate by others. 23 Basically, a Gaussian distribution can
approach reality for the high density core of the track, where most of the charge is. But it 
does not predict the low-density tail of the track distribution at large radii (which is usually 
only a small part of the charge, however). 

[Figure 10 axis: radial distance (µm)]

Figure 10. Calculated track structures for two ions with similar LET, after 
Dodd. 23 Delta rays have different ranges, but structure is very similar in the 
high density core. 

Howard et al. 22

This discussion applies only to silicon so far. In oxides, there are three main
differences. First, the radius of the dense core of the track is even smaller. The reason is
that carriers in polar materials lose energy to the lattice by optical phonon interactions.
When a charged particle passes through a medium, the dominant energy loss mechanism is
the production of plasmons, which then decay to electron-hole pairs. 25, 26 In SiO₂, the
plasmon energy is 22 eV, and the electron-hole pair energy is 17 eV, 28 leaving about 5 eV
of energy lost when the carriers reach thermal equilibrium with the lattice. This energy is
lost through the emission of optical phonons, with an energy of about 0.1 eV, 29-32 and a
mean free path for emission of about 0.1 nm. 33 That is, the excess kinetic energy is lost in
about 50 steps of 0.1 nm, but this is a random walk, where the mean distance traveled
scales with the square root of the number of steps, rather than the number itself. In other
words, the extra kinetic energy of the charge pair is lost almost immediately, and the carriers
reach equilibrium with the lattice almost where they are created. The optical phonon energy
is proportional to the mass difference between Si atoms and O atoms in SiO₂, or to M₁ - M₂ in
general. Of course, in Si, M₁ is equal to M₂, so there is no optical phonon effect. Therefore,
the carriers retain their kinetic energy longer, and diffuse farther before reaching equilibrium
with the lattice, so the initial track diameter is larger than in the oxide. The second
difference is that recombination is a much stronger effect in the oxide than in Si, as we will
discuss shortly. The third difference is that, because oxides are so thin, there are fewer
charges produced to begin with, and even fewer survive recombination. For these reasons,
many problems of practical interest in the oxide involve only 100-1000 charges.
Determining the low-density tail of the charge distribution is, therefore, of less practical
importance. The track structure in SiO₂ has not been studied as much as in Si, partly for
these reasons, but also because most of the work on SiO₂ was done before the detailed
studies in Si.
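
The thermalization estimate above is a short piece of arithmetic, sketched here with the quoted numbers. The variable names are illustrative, and the square-root rule is the usual root-mean-square result for an unbiased random walk with fixed step length, which is assumed here to be what the text intends.

```python
import math

plasmon_energy_ev = 22.0     # plasmon energy in SiO2, from the text
pair_energy_ev = 17.0        # electron-hole pair energy in SiO2, from the text
phonon_energy_ev = 0.1       # energy lost per optical phonon emission
mean_free_path_nm = 0.1      # step length between emissions

excess_ev = plasmon_energy_ev - pair_energy_ev
n_steps = round(excess_ev / phonon_energy_ev)

# Total path walked versus net (rms) displacement for a random walk.
path_length_nm = n_steps * mean_free_path_nm
net_displacement_nm = math.sqrt(n_steps) * mean_free_path_nm

print(f"steps: {n_steps}")
print(f"path length: {path_length_nm:.1f} nm")
print(f"net displacement: {net_displacement_nm:.2f} nm")
```

The carriers walk about 5 nm in total but end up less than 1 nm from where they started, which is why the dense core of the track stays so narrow in the oxide.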


2.2 Recombination 

Recombination has been treated fully previously. 34 In general, there are two models,
which describe limiting cases. The geminate model describes the case where electron-hole
pairs are far apart, and can be treated as isolated. The columnar model describes the case
where there is a dense column of charge, and the separation between an electron and a hole
from a given pair is greater than the mean separation between pairs. The columnar model is
most relevant in a discussion of SEE, because ions create dense columns of charge. The
columnar model was originally developed by Jaffe, who used it to model ionization in gases,
in the early part of the last century. 35 The equation that Jaffe developed had three terms: a
bimolecular recombination term originally proposed by Langevin, a diffusion term, and a
drift term to account for the effects of any applied field. Jaffe’s original analytical solution
began by solving the diffusion term first, getting cylindrical distributions of positive and
negative charges, and then letting the cylinders move past each other in response to the
applied fields, as illustrated in Fig. 12. Finally, he reintroduced the effect of the
recombination term. The problem with this approach, treating recombination as a
perturbation, is that the recombination term is the largest term, not the smallest. Even so,
Jaffe was able to get reasonable agreement between his experiments and his theory. But
work applying the Jaffe model to SiO₂ has all used numerical methods to solve the whole
equation, without neglecting any terms. In the work applying the columnar model to SiO₂, it
has usually been assumed that the radial track distribution is Gaussian, with a half diameter,
b, of 3.5 nm. 36-41 Typical experimental results for alpha particles and protons are shown in
Fig. 13 and Fig. 14, respectively. Calculated results for a range of other ions and applied
fields are shown in Fig. 15. In each case, the applied field, plotted on the horizontal axis,
refers only to the component of the field normal to the axis of the cylinder. Depending on
the ion type and field, the yield of charge ranges from about 0.3 for protons, to 0.1 for alpha
particles, to 0.01 or less for heavier ions. On the other hand, for high fields, yield in a Co-60
source will typically be much higher, 0.8 or 0.9, so recombination is an important effect
when dealing with any type of ion in SiO₂. The actual yield for an ion in SiO₂ is a critical
question when discussing stuck bits, which we will do later.
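
To see what these yields mean in practice, the sketch below combines the SiO₂ entry of Table 1 with a charge yield to estimate how many pairs survive recombination. The LET, the oxide thickness, and the function name are hypothetical values chosen only for illustration. The yields span the range quoted above for alpha particles and heavier ions.

```python
PAIRS_PER_UM_SIO2 = 1.29e4   # pairs per micron per unit LET in SiO2 (Table 1)

def surviving_pairs(let: float, oxide_nm: float, charge_yield: float) -> float:
    """Pairs escaping recombination for an ion crossing the oxide at normal incidence."""
    generated = PAIRS_PER_UM_SIO2 * let * (oxide_nm * 1e-3)  # nm converted to um
    return generated * charge_yield

# Assumed case: LET of 40 MeV/(mg/cm^2) through a 10 nm oxide.
for y in (0.01, 0.1):
    print(f"yield {y}: about {surviving_pairs(40.0, 10.0, y):.0f} surviving pairs")
```

Under these assumptions only tens to hundreds of pairs survive, so the uncertainty in the yield translates directly into the uncertainty in the final charge.
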

Figure 12. Columnar recombination, cylinders of charge moving under the
influence of normal and parallel field components.



Figure 13. Alpha particle recombination in SiO₂, compared to model
predictions, as a function of field.

Figure 14. Recombination for 700 keV protons as a function of field,
compared to model predictions.

Figure 15. Model predictions of recombination for different normal field
components and LETs.

The biggest discrepancy between columnar model results and experiment is that for
very high LET ions, the measured charge yield is often around 0.01, but the calculated result
is perhaps an order of magnitude less, especially with no normal field component (for
example, Fig. 16). 38 A more accurate initial track structure would probably improve this
situation. Much of our understanding of the track structure in Si has been developed since
this recombination work was done. Similar analysis of the track structure in SiO₂ would be
a useful thing to do.



Figure 16. Recombination measurements, compared with model predictions.
At very high LET, and low field, the difference may be as much as an order of
magnitude. 38

A problem of real practical interest currently is the application of this
recombination analysis to the problem of total dose from protons. One basic problem is that,
as we have said, there are two models that treat limiting cases, where the charge pairs are either
close together or far apart. But many practical experimental results fall in the transition
region, where the experimental conditions do not satisfy the assumptions of either model. In
Fig. 17, the LET curve for electrons satisfies the geminate assumption, that the charge pairs
are far apart. For heavy ions, satisfying the columnar assumption, the LET is off the top of
the chart. The proton LETs, shown in Fig. 17, are intermediate. The critical proton
recombination results are shown in Fig. 18. The dashed line in Fig. 18 indicates an attempt
to predict recombination in this transition region, and it fits the experimental results fairly
well. The LET for protons does not reach the value of LET for 1-2 MeV electrons, which is
the real geminate limit, until the proton energy reaches about 1000 MeV. Earlier versions of
this figure had somewhat different curves, especially in the transition region. 42 However,
recent experiments 43, 44 show somewhat less yield than had been reported earlier, falling
below the dashed line. Therefore, the dotted line is probably a better approximation of what
happens in the transition region. There is still a columnar component to the recombination
process for protons between 100 and 200 MeV, even though the assumption of the model,
that the charge pairs are very close together, is not strictly satisfied.



[Figure 17 axis: particle energy (MeV), 0.001 to 100]

Figure 17. LET for protons and electrons at different energies. 42
Recombination for electrons is described by the geminate model, while the
columnar model is appropriate for high Z ions (not shown). Protons at many
energies of practical interest are in an intermediate range.



[Figure 18 axis: proton energy (MeV)]

Figure 18. Measured proton recombination, different authors and different 
energies, compared to model results. 

Finally, we have already noted that recombination is a much stronger process in
insulators than in semiconductors, and discussed one of the contributing factors, the fact that
the greater track diameter results in a lower charge density in semiconductors. There is
actually a great deal of data on recombination in Si in the literature, however, because of
work by the Si detector community. Typically, there is a difference between the current
pulse produced in a detector by an ion, and the pulse that would be expected if all the energy
of the ion were converted to ionization. This difference is called the pulse height defect, or
PHD, and it consists of three components. The first of these is the energy lost by the ion
passing through the metal electrode and other over-layers, which can be estimated very
accurately from the LET if the composition and thickness of the layers are known. The
second component is non-ionizing energy loss from displacement damage, especially at the
end of the track. This component can also be estimated very accurately from the standard
theory, the LSS model. 45 The remaining PHD is from recombination, and it is in the range of
10% or less for most ions of practical interest to this community. 46-48 For the very high Z
ions, or high LET fission fragments, recombination is somewhat higher, but still much less
than in insulators. In general, the result will depend on the applied field and the resistivity
of the detector. We have mentioned the fact that the charges are initially deposited farther
apart in Si than in SiO₂ as one reason for this. The other reason is that screening lengths are
much shorter in semiconductors than in insulators. Normally, if a free charge is introduced
into a material, the other carriers in the material will move to screen off the field, 49 so the
potential and the field fall off faster with increasing radius than would otherwise be the
case. Of course, the greater the density of free carriers, the shorter the screening length. 49 If
the screening distance is greater than the separation between the charges, then they will exert
electrostatic coulomb forces on each other, bringing positive and negative charges together
to recombine, as is often observed in insulators. If the screening distance is less than the
separation between charges, then the fields are screened, and the charges do not exert
electrostatic forces on each other. Therefore, the positive and negative charges are not
pulled together, and recombination events are rare, as is usually the case in semiconductors.
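
The dependence of screening length on carrier density can be made concrete with the standard Debye length expression for a nondegenerate semiconductor, L_D = sqrt(ε kT / (q² n)). This formula is not given in the text and is used here as an assumption about what "screening length" means. The relative permittivity of 11.7 for Si, the temperature of 300 K, and the sample densities are likewise illustrative choices.

```python
import math

Q = 1.602e-19       # elementary charge, C
K_B = 1.381e-23     # Boltzmann constant, J/K
EPS0 = 8.854e-12    # vacuum permittivity, F/m

def debye_length_um(carriers_cm3: float, eps_r: float = 11.7,
                    temp_k: float = 300.0) -> float:
    """Debye screening length in microns (nondegenerate approximation)."""
    n_m3 = carriers_cm3 * 1e6  # convert cm^-3 to m^-3
    length_m = math.sqrt(eps_r * EPS0 * K_B * temp_k / (Q**2 * n_m3))
    return length_m * 1e6

for n in (1e13, 1e15, 1e17):
    print(f"n = {n:.0e} cm^-3: screening length {debye_length_um(n) * 1e3:.1f} nm")
```

The length falls as the inverse square root of the carrier density, so a hundredfold increase in free carriers shortens the screening length by a factor of ten, consistent with the qualitative statement above.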


2.3 Charge transport and collection — funneling

Following the discovery of SEE, the first discussion of charge transport and 
collection was by Kirkpatrick, 50 who assumed that charge was transported primarily by
diffusion. He then worked out a number of examples, such as the illustration in Fig. 19. He
concluded that the charge collected by a device decreased less rapidly with scaling than the 
critical charge required to cause an error. “Thus the charge margins preventing soft failures 
in most current devices will vanish if the devices are made smaller without significant 
design changes.” 

Figure 19. Charge collection across an array, assuming transport is primarily
by diffusion.


Shortly afterwards, however, Hsieh et al. 51-53 reported what they called the “field- 
funnel effect,” which meant that charge collection by drift, under the influence of an applied 
field, was more significant than originally realized. The funnel effect was originally 
discovered in device simulations, using the FIELDAY code, 54 and confirmed 
experimentally. But IBM declined to make the code available to the government, so the 
DOD began an effort to develop its own codes. In the meantime, several simple analytical 
models were proposed for use until codes became available. These models were widely 
used for a time, especially the one by McLean and Oldham, until the codes were ready. The 
basic ideas are illustrated in Fig. 20 and Fig. 21. Qualitatively, these models and the codes 
reflect the same physical processes. First, the dense electron-hole plasma, which is formed 
along the ion track, collapses the junction depletion layer, leaving, in effect, a conducting 
wire in contact with the electrode and embedded in the substrate. Then the field extends 
down into the substrate (along the surface of the plasma wire), while the plasma expands 
radially by ambipolar diffusion. Finally, when the plasma density approaches the 
background doping density, the depletion region reforms, which cuts off the funnel. When 
funneling was first identified, it was viewed as a serious problem. The point of the 
schematic in Fig. 22 is that the struck bit is much more likely to upset with funneling. The 
figure shows an array of circuit nodes, with the charge collection profile expected from 
diffusion alone, and with significant funneling included. For the critical charge indicated, 
funneling is the difference between an upset and no upset. However, from our vantage point 
today, funneling is much less of an issue than it might have appeared, for several reasons. 
First, highly doped layers cut off funneling relatively quickly, because the plasma density 
reaches the background doping density more quickly. Today, the widespread use of 
retrograde well technology 59 means there is often a highly doped layer just below the active 
device region, which effectively cuts off any funnel almost immediately. Second, the 
amount of charge on an electrode has shrunk as feature sizes have gotten smaller and 
voltages have been reduced — it is not clear that fields can be maintained in the substrate in 
today’s devices in any case. Of course, funneling, as in Fig. 22, would probably be 
considered a good thing if it occurred. Error correcting codes would fix the one struck bit, 
and less charge would diffuse to neighboring cells, so multiple bit upsets would be less 
likely. For this reason, diffusion seems to be becoming relatively more important again, 
receiving renewed attention. An example of test data where diffusion seems to play a clear 
role is shown in Fig. 23. 60 Normally, there is a saturated value of the upset cross-section, 
which corresponds to the size of the sensitive structure (DRAM storage capacitor, for example). 
In the figure, the largest measured cross-section is about 4 µm^2, which is almost the 
published cell size, 61 and much larger than the storage capacitor size. Clearly, if the ion can 
cause an upset by hitting anywhere in the cell, diffusion of charge to the storage capacitor is 
a critical mechanism, and it would have to be treated adequately in any kind of error rate 
calculation. 
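
The statement that highly doped layers cut off funneling sooner can be made concrete with a crude estimate. The sketch below is not the McLean and Oldham model or any of the codes mentioned above. It simply assumes the track is a line source spreading radially by two-dimensional ambipolar diffusion, and asks when the density on the track axis falls to the background doping. The diffusivity value, the LET, and the function names are assumptions for illustration.

```python
import math

SI_DENSITY_MG_PER_CM3 = 2330.0   # density of silicon, mg/cm^3
EV_PER_PAIR = 3.6                # energy per electron-hole pair in Si, eV
D_AMBIPOLAR = 18.0               # assumed ambipolar diffusivity, cm^2/s


def pairs_per_cm(let_mev_cm2_per_mg):
    """Electron-hole pairs created per cm of track for a given LET."""
    return let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3 * 1e6 / EV_PER_PAIR


def axis_density(let_mev_cm2_per_mg, t_s):
    """Plasma density on the track axis (cm^-3) after radial diffusion for t_s."""
    return pairs_per_cm(let_mev_cm2_per_mg) / (4.0 * math.pi * D_AMBIPOLAR * t_s)


def cutoff_time_s(let_mev_cm2_per_mg, doping_per_cm3):
    """Time at which the axis density has dropped to the background doping."""
    return pairs_per_cm(let_mev_cm2_per_mg) / (
        4.0 * math.pi * D_AMBIPOLAR * doping_per_cm3
    )


if __name__ == "__main__":
    for doping in (1e15, 1e17, 1e18):  # assumed background doping levels
        print(f"N = {doping:.0e} cm^-3: cutoff after {cutoff_time_s(10.0, doping):.2e} s")
```

In this picture the cutoff time is inversely proportional to the doping, which is the qualitative trend described above for retrograde wells. The absolute times should not be read as predictions.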

Figure 20. Charge distribution during funneling: t0 — ion strike creates a 
plasma filament in the Si; t1 — the highly conductive charge column collapses 
the depletion region, starts to expand by ambipolar diffusion; t2 — expansion 
continues; t3 — when charge density in the column approaches background 
doping level, the depletion region starts to reform; t5 — depletion region 
completely reformed. 

Figure 21. Potential distribution in the substrate during funneling: t0 — 
depletion region at the time of the ion strike; t1 and t2 — potential extends into 
the substrate as the charge column expands; t3 and t4 — potential in substrate 
reduced as the depletion region reforms; t5 — depletion region completely 
reformed. 

Figure 22. Charge collection across an array, with enhanced collection by 
funneling at the struck node. 

Figure 23. Upset cross-section as a function of LET; lack of a saturated 
cross-section indicates importance of diffusion. 60 At high LET, cross-section is 
approaching published cell size. 61 

Finally, we note that in structures with highly doped epi substrates, parasitic bipolar 
devices were sometimes observed to be turned on by ion strikes. This effect was called the 
ion shunt effect, 62, 63 and it often meant that the collected charge in the circuit exceeded that 
deposited by the ion. We will discuss parasitic devices of this kind further in later sections. 

In insulators, charge transport and trapping are relatively easy to simulate, because 
there are usually only a few hundred charge pairs in the oxide. The exact number depends, 
of course, on the LET of the ion, oxide thickness, angle of incidence, and electric field. A 
typical case is illustrated in Fig. 24. The holes are deposited by the ion, and those escaping 
recombination are allowed to transport following the CTRW theory. 34 In this case CTRW 
means that the holes hop about 1 nm at a time, parallel to the total field, which includes both 
the applied field and the space charge field from the coulomb interaction with the other 
charges in the problem. Basically, the holes generated near the interface reach the interface 
first and are trapped. They set up a space charge field, which limits the charge density near 
the center of the distribution at the interface. Charges generated farther from the interface 
move radially because of the coulomb forces from the other charges, until they are far 
enough out. Then the applied field pushes them to the interface. In effect, trapping at the 
interface happens roughly in a series of concentric rings, with the outer rings filled last. The 
final charge distribution at the interface is illustrated in Fig. 25, which shows the top view of 
the Si/SiO2 interface, with each dot representing a trapped charge. When device sizes were 
shrunk to the point that they were comparable to the diameter of the footprint of charge in 
Fig. 25, single ion hard errors (stuck bits) began to be observed. We will discuss this subject 
in more detail later. 

Figure 24. Charge transport in gate oxide — space charge fields from charge 
trapped at the interface force charges transporting later to move away from the 
ion track. 

Figure 25. Charge trapping for two different trapping efficiencies; each dot 
represents one trapped charge at the interface. Ion enters the oxide above the 
interface at an angle of 45 degrees from vertical, and travels along the x-axis. 


3.0 Device and Circuit Effects 


In this section we will discuss SEE in devices and circuits. Many of the topics in this 
section have been covered more than once in previous short courses, from different 
perspectives and in differing levels of detail. Two previous Short Course presentations that 
have been particularly useful are by Johnson and Galloway, 64 and by Dodd. 65 


3.1 Upset 

Upsets from single particles (SEU) were first reported at this conference by Binder et 
al. in 1975, 66 who analyzed a simple (by today’s standards) bipolar flip-flop circuit. 
However, SEU did not achieve widespread recognition, especially in the commercial 
semiconductor industry, until May and Woods 67 reported alpha-particle induced SEU in 
high-volume Intel DRAMs (4K and 16K) and some SRAMs. Pickel and Blandford, 68,69 and 
Johnston 70 have considered how device scaling will affect the upset rate. In 1982, Pickel 
and Blandford calculated that the future sensitivity to upset would remain roughly constant, 
despite continued scaling. Although smaller devices would have lower values of critical 
charge, the sensitive volume would also be smaller, so the charge would have to be 
deposited in a shorter path-length. They calculated these two effects would roughly offset, 
so that the upset rate would be approximately the same. About twenty years later, Johnston 
reviewed the trend in upset rates over that period, and found that the rate had, in fact, been 
nearly constant. 


3.1.1 DRAMs 

The mechanism for SEU in DRAMs is illustrated in Fig. 26, which is taken from 
May and Woods. The two states of the cell are indicated — either the well of the storage 
capacitor is filled with electrons, or it is empty. When an alpha particle passes through the 
well, electron-hole pairs are created in the Si, with the holes diffusing into the substrate, and 
the electrons being collected in the well of the storage capacitor. If the well was already full 
of electrons, no change of state occurs. But if the well is empty, the electrons tend to fill it, 
and if the well becomes full enough, it is sensed as full, and the bit flips — changes state. 
This effect was also called a soft error, because the bit could be reset, and the cell would 
work as well as ever. The critical charge to cause an upset was the number of electrons that 
had to be collected in the well to cause an upset. Generally, it was half the full-rail voltage 
swing (for example, 2.5 V in a 5 V part) times the storage capacitance, Cs. May and Woods 
performed several experiments involving shrinking the storage capacitor dimensions, or 
varying the oxide thickness, and found that the error rate was extremely sensitive to Cs, or 
critical charge. 
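
As a worked instance of the rule just stated (critical charge equals half the full-rail swing times Cs), the following sketch computes the critical charge and the corresponding number of electrons. The 50 fF storage capacitance is an assumed, illustrative value, not a figure from May and Woods, and the function names are hypothetical.

```python
Q_ELECTRON = 1.602e-19  # elementary charge, C


def critical_charge_c(v_full_rail, c_storage_f):
    """Critical charge in coulombs: half the full-rail swing times Cs."""
    return 0.5 * v_full_rail * c_storage_f


def critical_electrons(v_full_rail, c_storage_f):
    """Number of collected electrons needed to flip an empty cell."""
    return critical_charge_c(v_full_rail, c_storage_f) / Q_ELECTRON


if __name__ == "__main__":
    cs = 50e-15  # assumed storage capacitance, F (illustrative only)
    for label, c in (("nominal Cs", cs), ("Cs halved", cs / 2)):
        q_fc = critical_charge_c(5.0, c) * 1e15
        n_e = critical_electrons(5.0, c)
        print(f"{label}: Qcrit = {q_fc:.1f} fC, about {n_e:.2e} electrons")
```

Because the critical charge is linear in Cs, halving the capacitance halves the charge an ion must deliver, which is consistent with the strong sensitivity to Cs reported above.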

Figure 26. DRAM upset, after May and Woods. 67 If an empty well is struck by 
an ion, it becomes partly full, which may be detected as a change of state. A 
full well remains full, so no change of state is detected. 


3.1.2 SRAMs 

In SRAMs, there are two main types of cell design to be considered: six transistor 
(6T), which is illustrated in Fig. 27, 69 and four transistor (4T), which is shown in Fig. 28. 72 In 
the 6T cell, the typical upset mechanism is that an ion strikes the drain on the right side of 
Fig. 27, which is biased high initially. The ion strike pulls down the voltage on drain B, 
which also pulls down the voltage on the gates of the inverter on the left side, which tends to 
turn on the “off” p-channel device and turn off the “on” n-channel device. When 
current flows through P1, drain A is charged high, which raises the gate voltages on the 
right side of Fig. 27, turning P2 off, and N2 on, leaving the cell in the opposite state from 
where it started. Simultaneously, however, current is flowing through P2, which is on 
initially, to restore node B to its high state. If the cell recovers by this mechanism before the 
feedback loop can be closed, no upset will be observed. If the feedback process is 
completed before the recovery process, then an upset occurs. It is a race between feedback 
and recovery. The purpose of hardening with poly resistors is to introduce another RC 
delay, slowing down the feedback process, so that the recovery process always wins. Dodd 
et al. 71 performed modeling, which indicated that more charge is collected in a 6T cell if the 
cell recovers than if the cell changes state. The reason is that when transistor P2 switches, it 
cuts off the charge collection, which continues unless P2 switches. The concept of critical 
charge loses its meaning, in this case. 



Figure 28. Four transistor (4T) static RAM cell, after Diehl-Nagle. 72 

In the 4T cell, Fig. 28, the situation is somewhat different. Instead of active p-channel 
load devices, the cell has two large poly-Si resistors. This cell design was attractive 
to the commercial industry because the cell area was reduced by about a third if one third of 
the transistors were eliminated, which reduced the cost of the chip by about a third, too. The 
resistors were built in a layer of poly-Si on top of the transistors, so there was no area cost 
associated with them. The problem of an ion strike in a 4T cell was first analyzed by Diehl- 
Nagle, 72 who identified what she called a disturbed condition following an ion strike. If an 
ion struck node A, which is biased high initially, it would pull the voltage down, tending to 
turn N3 off. Eventually, the current flowing through the resistor to A would restore the cell 
to its correct state. The problem was that the restoring current flowed very slowly, because 
the resistor was very large, as it had to be to keep power consumption within reasonable 
bounds. In Diehl-Nagle’s analysis, most of the data was obtained on 64K SRAMs, and the 
value of the resistors was about 10^11 ohms. Therefore, with 5 V applied, the current through 
the resistor was about 50 pA, which meant that it took on the order of 1 ms for the cell to 
recover. If it was read in that interval, the likely result would be an incorrect read. 
Typically, a struck cell would be disturbed for 10^4 or 10^5 read cycles, so there was a 
significant probability of such an incorrect read. The other problem with the basic 4T 
design was that in each subsequent technology generation, the resistors had to have 4x 
greater resistance to maintain the same power consumption, because (of course) there were 
four times the number of cells. Therefore, the recovery time of a disturbed cell also 
increased by 4x in each generation. Almost all the commercial 1M SRAMs ever sold had 
4T cell designs, but this tradeoff between power consumption and alpha particle immunity 
forced the industry to begin switching to other approaches after that. 
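
The arithmetic behind the 4T recovery problem can be sketched as follows. The resistor value and supply voltage come from the text. The 10 fF node capacitance and the 100 ns read cycle are assumptions chosen only so the numbers can be worked through, and the function names are hypothetical.

```python
def restoring_current_a(v_supply, r_load_ohm):
    """Current available through the load resistor to recharge the struck node."""
    return v_supply / r_load_ohm


def recovery_time_s(v_swing, c_node_f, i_restore_a):
    """Time to replace the charge C*V on the node at a constant restoring current."""
    return c_node_f * v_swing / i_restore_a


def disturbed_read_cycles(t_recover_s, t_read_cycle_s):
    """Number of read cycles that fit inside the recovery interval."""
    return t_recover_s / t_read_cycle_s


def scaled_recovery_time_s(t_recover_s, generations):
    """Recovery time after the resistance grows 4x per technology generation."""
    return t_recover_s * 4 ** generations


if __name__ == "__main__":
    i_restore = restoring_current_a(5.0, 1e11)            # values from the text
    t_recover = recovery_time_s(5.0, 10e-15, i_restore)   # assumed 10 fF node
    cycles = disturbed_read_cycles(t_recover, 100e-9)     # assumed 100 ns cycle
    print(f"restoring current = {i_restore * 1e12:.0f} pA")
    print(f"recovery time = {t_recover * 1e3:.2f} ms, about {cycles:.0e} read cycles")
    print(f"two generations later = {scaled_recovery_time_s(t_recover, 2) * 1e3:.0f} ms")
```

With these assumed values the estimate reproduces the order of magnitude quoted above, roughly 1 ms and 10^4 read cycles, and shows how quickly the exposure window grows if the resistance is quadrupled each generation.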


3.1.3 Commercial Industry Hardening 

Although it is widely accepted in the radiation effects community that the 
commercial industry does not care about radiation hardening, the fact is the industry cares 
very much about alpha particle immunity. And the industry has taken a number of specific 
actions to try to improve alpha particle immunity, which can be described as hardening 
approaches. We will specifically discuss five of these approaches. The first, and most 
obvious, thing was to increase the storage capacitance in DRAMs, to increase the critical 
charge. An early example of this approach is illustrated in Fig. 29. 73 Of course, thinning the 
gate oxide is a major feature of scaling in general, but Intel had made a concerted effort to 
thin the oxide in the storage capacitor (even) faster than they thinned the gate oxide. This 
was in the days when the storage capacitor was a planar structure with a pure SiO2 dielectric. 
After about the 1M generation of DRAMs, cell area was too small to get reasonable storage 
capacitance in a planar structure, so the companies went to trench capacitors, or stacked 
dielectric structures. The industry also began looking at other dielectric materials with 
higher dielectric constants, nitrides and oxy-nitrides initially, and other things like BST 
(barium strontium titanate) later. Details of the capacitor processing are usually not 
revealed, but the major companies do sometimes publish papers stating what the storage 
capacitance is. In Table II, we summarize published values of Cs by company and by 
generation of chips, which are plotted in Fig. 30. 61,74-116 The scaled value of capacitance 
would shrink in every generation, according to simple scaling, but the actual capacitance has 
been fairly stable over several generations, indicating increasing amounts of ingenuity in the 
design and construction of the capacitor structure. 

Figure 29. Storage capacitance for selected DRAMs, after T.C. May. 73 

Figure 30. Storage capacitance for DRAMs, by generation of technology, and 
by company. 

Second, we have already touched on the use of retrograde well technology 59 to 
create a highly doped buried layer under the active device region, which cuts off charge 
collection from an ion strike. The basic approach is illustrated in Fig. 31. The use of 
high-energy (MeV) implanters for this purpose 117-119 is an important supporting technology. A 
high-energy implanter is used to create a heavily doped layer at some depth, determined by 
the range of the ions in Fig. 31. Then a short high temperature activation step leads to the 
dopant profile indicated. In a conventional process, a shallow implant is used to produce a 
high dopant concentration near the surface, which has to be driven in with a longer high 
temperature step, producing a profile similar to that shown in Fig. 31. The retrograde well 
approach is used in both SRAMs and DRAMs. 120-122 



Figure 31. Retrograde well technology using high-energy implanter, compared 
to conventional well. 59 

Third, the industry has basically abandoned the 4T cell, for the reasons we have 
already mentioned. If the resistors are large enough to limit power consumption to 
reasonable levels, the alpha particle problem grows out of bounds. Instead, recent SRAMs 
have been built with a 6T cell design. To maintain the area (and cost) benefits of the 4T 
approach, the p-channel devices are TFT (thin film transistors) fabricated in a layer of poly- 
Si on top of the n-channel devices. This approach is illustrated in Fig. 32. These p-channel 
devices do not have the electrical characteristics of devices made in crystalline Si, because 
the starting material is not of the same quality. But the on-current is typically 10^5 or 10^6 
times the off current, which means that the voltage recovers much faster after an ion strike, 
than in a comparable 4T design. And because the p-channel devices are on top of the n- 
channel devices, there is no area penalty, compared to a comparable 4T cell. 

Figure 32. Thin film transistors (TFT) — commercial SRAMs now have 6T 
designs with p-channel devices fabricated in a layer of poly-Si on top of the 
n-channel devices. 



Figure 33. Soft error rate by generation of SRAM (horizontal axis: SRAM generation in Mbits), 
with approximately an order of magnitude improvement when B10 is eliminated, after Baumann. 123 

Fourth, it has been shown by Baumann 123 that most of the soft errors observed in 
some processes are due to the interaction of B10 with thermal neutrons from sea-level 
cosmic rays. Eliminating B10 from the process reduced the observed soft error rate by about 
90%, as shown in Fig. 33. Therefore, other companies have also begun to eliminate B10 
from their processes. 

Fifth, and finally, it is ironic — given the resistance of some in the radiation effects 
community to plastic packaging — that one of the reasons the industry first started going to 
plastic packages was for radiation hardening. The plastics have much lower concentrations 
of alpha emitters than the materials they replaced, which is illustrated by the data in Table 
3. 124 If one could reduce the alpha error rate by one, or two, or three, or even more, orders 
of magnitude by changing packages, it was a powerful incentive to do so. Once the industry 
discovered that plastics were also cheaper, there was no holding them back. 


Table 3. 124 Alpha-Particle Emission Rates of Processing Films, Leadframes, 
and Packaging Materials 

Material                          Emission Rate (alpha particles/(cm^2 · hour)) 

Bare silicon                      0.00020 
(material label missing)          0.00164 
Si + plasma oxide                 0.00188 
Si + plasma nitride               0.00433 
Si + tungsten                     0.00308 
Si + aluminum                     0.00682 
Si + polysilicon                  0.00098 
Si + field oxide                  <0.00010 
Si + BPSG                         <0.00010 
Si + CVD nitride                  <0.00010 
Fully processed w/o WSix          0.02400 
Fully processed + WSix            0.04230 
Polyimide die coat                <0.00010 
DIP leadframe                     0.00677 
ZIP leadframe                     0.00258 
256K DIP                          0.00124 
64K DIP                           0.00109 
Metal package lid (vendor A)      0.015 
Metal package lid (vendor B)      0.030 
Ceramic package lid (vendor A)    0.15 
Ceramic package lid (vendor B)    3.10 
(material label missing)          0.00080 
Ceramic DIP (vendor A)            0.02320 
Ceramic DIP (vendor B)            0.03230 
Ceramic DIP (vendor C)            0.02610 
Ceramic LCC                       0.02530 
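
To see how the emission rates in Table 3 translate into orders of magnitude, the sketch below compares a few entries and converts a rate into alpha particles per hour over a chip. The 0.25 cm^2 area is an assumed value, the function names are hypothetical, and the table does not state which entries are plastic parts, so the comparison is by table label only.

```python
import math

# Emission rates in alpha particles/(cm^2 * hour), copied from Table 3.
EMISSION_RATE = {
    "Ceramic package lid (vendor B)": 3.10,
    "Ceramic package lid (vendor A)": 0.15,
    "Metal package lid (vendor A)": 0.015,
    "256K DIP": 0.00124,
}


def alphas_per_hour(material, area_cm2):
    """Expected alpha particles per hour emitted over the given area."""
    return EMISSION_RATE[material] * area_cm2


def orders_of_magnitude_lower(worse, better):
    """log10 of the ratio of two emission rates."""
    return math.log10(EMISSION_RATE[worse] / EMISSION_RATE[better])


if __name__ == "__main__":
    area = 0.25  # assumed chip area, cm^2 (illustrative only)
    for name in EMISSION_RATE:
        print(f"{name}: {alphas_per_hour(name, area):.5f} alphas/hour")
    for name in list(EMISSION_RATE)[:3]:
        gap = orders_of_magnitude_lower(name, "256K DIP")
        print(f"{name} vs 256K DIP: {gap:.1f} orders of magnitude")
```

The spread between table entries covers roughly one to more than three orders of magnitude, which is the range of improvement described in the text.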


3.2 Latch-up 


It is well-known that a high current state, known as latchup, can occur anytime there 
is a four layer n-p-n-p structure. Such structures are inherent in CMOS, where n-channel 
transistors and p-channel transistors are located side by side. The structure must be 
regenerative, in that there is a mechanism to increase the current to very high values once 
the threshold values for establishing latchup have been reached. Latchup can be triggered in 
different ways, electrically by applying high voltage, by radiation, and by single heavy ions. 
Heavy ion-induced latchup was first reported by Kolasinski et al. 125 at this conference in 
1979. Latchup has been modeled in terms of two bipolar transistors, since at least 1973. 126 
Although the two-transistor model is highly simplified, it is a useful way to introduce the 
main features of latchup, and it is widely used. The structure is illustrated in Fig. 34. 127 The 
n+ source or drain (emitter), p-substrate (base), and n-well (collector) form a lateral bipolar 
parasitic device. The other, vertical, parasitic device is formed by the p+-source or drain 
(emitter), n-well (base), and p-substrate (collector). For each device, the collector is also the 
base of the other device, which leads to a positive feedback loop between the devices. 
Latchup is initiated when an ion strike causes current to flow in the well/substrate junction, 
which causes a voltage drop in the well. This voltage drop forward biases the vertical 
device, and the gain of the device results in increased current into the substrate. This 
substrate current causes a voltage drop in the substrate, which turns on the lateral device. 
This, in turn, results in increased current flow back to the base of the vertical device, 
initiating the positive feedback loop. The resistances shown in Fig. 34 are not fixed 
resistors, as indicated in the Figure, but rather distributed resistances. Their exact values 
depend on the detailed geometry, including the position of the ion strike, which is one 
reason it is difficult to deal with latchup analytically. The I-V characteristic of a latchable 
structure is shown in Fig. 35. In this case, the latchup is initiated electrically, by applying a 
high voltage. The curve is linear in Region I, the forward blocking region. When the voltage 
reaches the breakover voltage, there is a negative resistance region. In the latchup region, 
Region II, the latchup will be stable as long as voltage exceeds the holding voltage, the 
current exceeds the holding current, and the gain of the vertical and lateral devices is large 
enough. Johnston points out that this last condition is often stated as βv βl > 1 (the product of the vertical and lateral gains), but that the 
reality is more complicated because currents flowing in the well and substrate represent 
losses, which should be accounted for. Latchup is a major concern because it can cause 
catastrophic failure from excessive heating of active devices, or metallization, or bond wires. 
The most effective way to bring devices out of latchup is to turn off the power, reducing the 
voltage below the holding voltage. Even when a latchup is not immediately catastrophic, 
recent data indicates there is sometimes latent damage present after normal device operation 
is restored. 128 After non-destructive latchup, damage to interconnects was found that was 
visible in SEM pictures, even though the electrical characteristics of the circuit appeared to 
be normal. The implications of these results for the long-term reliability of the circuits are 
still unclear. Procedures for setting limits for current detection and shutdown for latchup 
protection may also need to be reconsidered. 
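
The three conditions for a stable latchup listed above can be written as a simple check. This sketch uses the simplified gain-product criterion, which, as Johnston notes, ignores the losses from well and substrate currents. The function name and all numerical values are assumptions for illustration.

```python
def latchup_is_sustained(v_supply, i_available, v_hold, i_hold,
                         beta_vertical, beta_lateral):
    """Simplified two-transistor test for whether a triggered latchup persists.

    All three conditions from the text must hold: the supply voltage exceeds the
    holding voltage, the available current exceeds the holding current, and the
    product of the parasitic bipolar gains exceeds one.
    """
    return (
        v_supply > v_hold
        and i_available > i_hold
        and beta_vertical * beta_lateral > 1.0
    )


if __name__ == "__main__":
    # Hypothetical device: 1.0 V holding voltage, 5 mA holding current.
    common = dict(i_available=0.1, v_hold=1.0, i_hold=5e-3,
                  beta_vertical=20.0, beta_lateral=0.5)
    print("5.0 V supply:", latchup_is_sustained(v_supply=5.0, **common))
    print("0.9 V supply:", latchup_is_sustained(v_supply=0.9, **common))
```

Dropping the supply below the holding voltage breaks the first condition, which is why removing power is the usual way to clear a latchup.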

Figure 34. Two transistor model for latchup, after Johnston. 127 





Figure 35. Latchup IV characteristic, showing forward blocking, breakover 
voltage, holding voltage and current, after Johnston. 127 

Since latchup can be initiated by purely electrical stimulus, the commercial 
semiconductor industry pays some attention to process steps that minimize latchup 
sensitivity. Of course, immunity to electrically induced latchup does not guarantee 
immunity to single-ion-induced latchup, but process changes that reduce sensitivity to one 
usually also help with the other. These process changes are intended to prevent the parasitic 
bipolar devices from turning on (becoming forward biased) or to reduce the gain, so that 
they are less likely to stay on. The process changes include increased doping levels in the 
well and substrate, and increased well depth, both of which reduce series resistance (and, 
therefore, voltage drops). Use of retrograde wells or epi substrates also reduces charge 
collection volumes. At one time, it was thought that an epi substrate was sufficient to 
eliminate latchup completely, but a number of counter-examples have now been observed, 
as illustrated in Fig. 36. 127 Indeed, one of the most sensitive parts ever tested (AMD K-5) is 
on a thin epi substrate. Trench isolation and guard bands have also been used to reduce 
latchup sensitivity. 127 Deeper trenches are more effective, of course. Guard bands impose an 
area penalty. 



Figure 36. Latchup thresholds for selected circuits on epi substrates, after 
Johnston. 127 The AMD K-5 is one of the most sensitive parts ever tested. 


Generally, the impact of scaling has been to make circuits more sensitive to latchup, 
because structures are closer together. Of the mitigation techniques we have just discussed, 
increased doping and trench isolation are the only ones consistent with continued scaling. 
Deeper wells may help, but scaling usually leads to shallower wells. Guard bands would not 
be expected in the most advanced technology. However, the trend toward increased 
sensitivity is likely to be reversed very soon, because power supply scaling will soon reduce 
Vdd below the latchup holding voltage in many applications. Typically the holding voltage 
is about 1 V, and the ITRS projects Vdd for high performance desktop applications to be 1 V 
this year, and below 1 V in 2005. Other applications vary by a few years, but the trend to 
lower operating voltages is clear. Also, if continued scaling forces the industry to adopt SOI 
at some point, latchup will be eliminated, because four layer structures will be eliminated — 
three layer transistor structures will be separated by dielectric isolation. However, snap- 
back, sometimes referred to as three layer latchup, will become more of an issue in SOI, as 
we discuss in the next section. There are many other authors who have discussed latchup in 
more detail than is possible here, and the reader may wish to consult their work. 129-139 

There are a number of unique testing issues associated with latchup, primarily 
because it depends on physical processes, e.g. charge diffusion, deep in the substrate. 
Latchup is a slower process than upset, requires longer-range ions than upset, and is strongly 
temperature dependent, with high temperature being worst case. Thermal generation of 
carriers contributes to establishing latchup. 


3.3 Snap-back 

Snap-back is another regenerative, high current mode related to parasitic bipolar 
action, which was first analyzed by Ochoa et al. 140 It differs from latchup in that it occurs in 
a three layer structure, a single MOSFET, and not a four layer structure. The source/well or 
substrate/drain regions of a MOSFET also represent a parasitic npn or pnp bipolar device, 
which turns on in snap-back. The source-drain breakdown characteristic of the MOSFET 
has a negative resistance region, which results in a stable, high current, low voltage- 
operating mode. Qualitatively, snap-back may appear to be similar to latchup, but it is not 
strongly temperature dependent, can be eliminated by reducing the voltage on the gate of the 
affected device (without cycling power to the whole circuit), and generally involves much 
lower levels of current in the whole chip. The micro-latches sometimes observed in testing 
complex circuits, localized high current regions, which cause a small increase in total chip 
current, may be individual devices in snap-back mode. Snap-back can be initiated, much 
like latchup, by high voltage, by high dose-rate radiation and by single heavy ions. The 
analysis by Ochoa et al. was based on dose-rate upset, but ion induced upset was analyzed in 
detail by Dodd et al. for SOI devices, 141 and it has been observed experimentally by Koga 
and Kolasinski. 142 

Some authors have used the term second breakdown interchangeably with 
snapback, 143 but second breakdown is a more general term than snapback, as we have 
defined it here. Second breakdown has been reported in many different kinds of devices, 144-146 
and is apparently triggered by a number of different mechanisms. 143 At least, there 
seems to be no consensus on what the triggering mechanism is. Generally, any device, 
where the I-V characteristic resembles Fig. 37, is said to undergo second breakdown. This 
is also called snapback because the voltage snaps back to a lower value from a higher value, 
as the current goes up. 

Figure 37. IV Characteristic for second breakdown or snapback. 


3.4 Burn-out 

Single event burnout (SEB) is typically observed in power transistors, both 
MOSFETs and bipolar. A schematic of a double-diffused power MOSFET is shown in 
Fig. 38, with an expanded view of the parasitic device in Fig. 39. 64 The MOSFET drain 
contact is on the backside of the wafer (not shown). The parasitic device is inherent in the 
MOSFET structure. A bipolar power device is shown in Fig. 40, 64 which is very similar to 
the parasitic structure in the MOSFET. SEB is triggered when an ion passes through 
(usually) an n-channel device biased “off,” with high blocking voltage. The currents 
generated by the ion turn on either the parasitic or the active bipolar device, and trigger a 
regenerative feedback mechanism, second breakdown, or snap-back. In second breakdown 
here, the ion track forms a plasma filament connecting source and drain, which leads to a 
high current condition, associated with an abrupt drop in the breakdown voltage and a 
negative resistance region similar to latchup. The low breakdown voltage means that 
avalanche multiplication of the injected current takes place, leading to positive feedback and 
a stable high current condition — in effect, a permanent short between source and drain. If 
the current is not limited somehow, it will eventually burn out the interconnects, destroying 
the device. 

Figure 38. Power MOSFET, with parasitic npn bipolar device, after Galloway 
and Johnson. 64 



Figure 39. Parasitic bipolar device from Figure 38, expanded view, after 
Galloway and Johnson. 64 
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Figure 40. Bipolar power transistor, after Galloway and Johnson. 64 
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A number of models have been used to develop an understanding of the physical 
processes involved in SEB. The first, and simplest, of these is the Current-Induced 
Avalanche (CIA) model, by Wrobel et al., 147 ' 149 which was used to show how the field 
distribution changed in a device as the current level increased. The field reflected the 
concentration of ionized dopants, but also the concentration of free carriers. At high enough 
current levels, avalanche multiplication started, explaining the regenerative feedback 
mechanism, which led to burnout. The field distribution at increasing current levels is 
illustrated in Fig. 41. Regenerative feedback starts when the field at the epi/substrate 
interface becomes high enough to cause avalanche multiplication of the injected current. The 
second model was outlined by Hohl and Galloway, 150 and extended by others, 151,152 and 
attempted to better quantify the regenerative feedback mechanism. They solved the Poisson 
equation for the base-collector depletion region, to determine the number of avalanche 
generated holes for a given number of injected electrons. The results are shown in Fig. 42. 
The first peak corresponds to reduced avalanching because the field in the depletion region 
is reduced, as the depletion region extends into the collector. The minimum avalanche 
region corresponds to the case where the injected electron density is comparable to the 
doping. The third region, where the hole concentration increases roughly proportionally 
with the electron concentration, corresponds to high fields at the epi/substrate interface. 





Figure 41. Current induced avalanche, after Wrobel. 147 As current density 
increases, high field at epi/substrate interface leads to avalanche injection, and 
regenerative high current mode. 

Figure 42. Avalanche multiplication, corresponding to high field at the 
epi/substrate junction, after Hohl and Galloway. 150 

These simple, analytical models were very useful in developing an understanding of 
the basic physical processes involved in burnout, but they did not capture all the relevant 
device physics. For detailed quantitative work, complex simulation tools are now available, 
and widely used. For example, one such simulator is based on the MEDICI tool. 153 Another 
simulator, developed by Kuboyama et al., is based on the PISCES model, but with custom 
features for burnout analysis. 154 

Burnout testing can be difficult and expensive, because it is a destructive test, which 
can consume large numbers of samples. But it is possible to do nondestructive burnout 
testing by detecting the current spike from the device turning on, then turning the device off 
before it actually burns out. Normally, this makes it practical to do reasonably complete 
testing. One important test technique is the use of EPICS (Energetic Particle Induced 
Charge Spectroscopy), which is a pulse height measurement system used to monitor the 
charge collection in burnout testing. EPICS results are illustrated in Fig. 43. 155 The first two 
peaks are due to ions hitting different parts of the device, and the high charge spike at high 
voltages indicates the device turning on, which would be followed by burnout in the absence 
of current limiting. 

A number of mitigation techniques have been identified for reducing the probability 
of SEB, all of which are intended to make the bipolar device harder to turn on. Extending 
the p+ plug has the effect of reducing the base resistance, which means that higher current 
levels are necessary to forward bias the device. Reducing the source/drain bias reduces the 
field in the base/collector depletion region, which reduces impact ionization, which makes 
the device harder to turn on. P-channel devices are much less susceptible to burnout than 
n-channel devices, so replacing n-channel devices with p-channel might be considered, 
although it is often not practical to do so. And burnout susceptibility is reduced at higher 
temperatures, so operating at higher temperature might be considered. However, higher 
temperature might also reduce the reliability of the device, or the system. 



Figure 43. EPICS (Energetic Particle Induced Charge Spectroscopy) results, 
showing high current spike, corresponding to parasitic device turning on, 
leading to burnout, after Kuboyama et al. 155 


3.5 Gate Rupture 


Single event gate rupture (SEGR) results when the interaction of an ion with the gate 
oxide results in the destruction of the gate oxide, and a hard short between the gate and the 
substrate. It was first observed by Blandford et al. 156, 157 in MNOS non-volatile memories, 
and later by others. 158 Within a short time, it was also reported in power devices, 159 which 
have been the focus of most of the more recent work. In MNOS memories, the effect was 
typically observed only for high LET ions (LET = 35 or more), and only during erase/write 
cycles, when high voltage was applied. The circuits were used as “read-mostly” memories, 
with 5V read voltage, and rewritten only rarely, but with 12 V applied. The circuits with 
permanent errors were subjected to failure analysis, and localized conducting paths from 
gate to substrate were found. Localized damage was also visible in SEM pictures. The first 
reports of gate rupture in power devices were by Fischer, 159 confirmed by Wrobel in 
capacitor studies, 8 and the parametric dependences of SEGR have been extensively 
studied by Titus and Wheatley. 160 The experimental data is summarized in Fig. 44, 160 which 
shows the conditions needed to initiate SEGR for different incident ions. Two models for 
SEGR have been developed, a simple analytical model by Brews et al., 161 and the device 
simulation code, Athena. 162 The analytical model is illustrated in Fig. 45, where the “plasma 
wire” 56 serves as a conceptual starting point, but the presence of the gate oxide, the source, 
and the drain are important differences that are taken into account. In the n-channel device 
in Fig. 45, the electrons diffuse radially, and are collected at the drain, and the holes tend to 
pile up at the gate oxide, before being collected at the source. When the holes are 
concentrated at the oxide, they induce image charges on the gate, increasing the field in the 
oxide. Breakdown occurs if the field exceeds some critical value, estimated by both Fischer 
and Wrobel as approximately 

$E_{cr} \approx (41 \times 10^{6}\ \mathrm{V/cm}) / (\mathrm{LET})^{1/2}$. 

That is, an ion with LET = 37 would correspond to a critical field of about 6.7 
MV/cm, compared to a breakdown field of perhaps 10 MV/cm or more in the absence of an 
ion strike. Of course, numerical simulation is quantitatively more accurate, and a 2-D 
simulator has been developed and used with success by Allenspach et al. 163-165 
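
As a quick check on the arithmetic, the approximate critical-field expression can be evaluated for a few LET values. This is a minimal sketch only: the function name is hypothetical, and the LET units are assumed here to be MeV-cm^2/mg, which the text does not state explicitly.

```python
import math


def critical_field_mv_per_cm(let_value):
    """Approximate critical oxide field for SEGR, in MV/cm.

    Evaluates E_cr ~ (41 MV/cm) / sqrt(LET), the approximate expression
    quoted in the text. LET is assumed (not stated in the text) to be in
    MeV-cm^2/mg.
    """
    if let_value <= 0:
        raise ValueError("LET must be positive")
    return 41.0 / math.sqrt(let_value)


if __name__ == "__main__":
    # LET = 37 is the example used in the text; 10 and 80 are illustrative.
    for let_value in (10, 37, 80):
        field = critical_field_mv_per_cm(let_value)
        print(f"LET = {let_value:3d}: E_cr ~ {field:.1f} MV/cm")
```

For LET = 37 this evaluates to about 6.7 MV/cm, the value quoted in the text.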

SEGR testing is difficult, because there is no way to test nondestructively. 
Therefore, one needs large numbers of samples to destroy, as well as large amounts of beam 
time, with very well calibrated beams. 

Techniques for reducing SEGR susceptibility have been identified. Increasing the 
oxide thickness obviously reduces the space charge field across the oxide, and, therefore, 
SEGR sensitivity. Of course, the effect of scaling is usually to reduce oxide thickness, so 
this method requires the manufacturer to adjust the process in a way that may seem 
unnatural. Reducing source and drain biases also reduces oxide fields. Pulling back the 
poly gate from the neck region reduces the fields in the neck region. And SEGR sensitivity 
is reduced at higher temperature, although high temperature operation may not be useful, for 
other reasons. 

Figure 44. Initiation of gate rupture for different ions, and exposure 
conditions, after Titus and Wheatley. 160 

Figure 45. Conceptual model for gate rupture, after Brews et al. 161 Space 
charge field from holes piling up at the oxide interface leads to breakdown. 


3.6 Stuck bits 


The so-called stuck bit problem is a single ion effect in a memory, which can occur 
in both DRAMs and SRAMs. Two mechanisms for stuck bits have been reported in the 
literature. 40, 41, 166 


3.6.1 Micro-dose — gate, field oxides 

The first of these is caused by the total dose deposited by a single ion passing 
through the gate oxide of a transistor. Obviously, this only happens in very small 
transistors, but it has been commonly observed for some time now. The effect was first 
reported by Koga et al., 167 first shown to be due to single ions by Dufour et al., 168 and analyzed 
in more detail, first by Oldham et al., 40 and later by Poivey et al. 41 The basic effect is that the 
trapped charge deposited by a single ion is enough to cause a small threshold voltage shift, 
which causes a small increase in subthreshold leakage current. The transport and trapping of 
this charge have already been described above. This is sometimes enough to cause the 
failure of an NMOS memory cell, in either a DRAM or in a four-transistor SRAM cell, 
because these cells are very sensitive to small leakage currents. We have already discussed 
the 4T cell, Fig. 28, in connection with upset. The problem here is damage to the gate 
region of one of the transistors. If the resistor is on the order of 10^12 ohms or more, which 
was typical for a 1M SRAM, the current flowing to the transistor was limited to a few pA, at 
most. If damage to the transistor meant leakage of more than a few pA, then charge 
would leak off the drain faster than it could be replaced, and one side of the cell could not be 
held on. A typical I-V characteristic for an n-channel MOSFET is illustrated in Fig. 46, 
where the pA current level corresponds to a voltage of only about 100 mV. But on a chip 
with millions of transistors, the distribution of Vt values will include outliers around +/- 6σ. 
Poivey et al. concluded that, for the 1M technology they were testing, the standard deviation 
was about 10 mV, based on analysis and confirming data. The variation in threshold voltage 
across a die, and between die is illustrated in Fig. 47. For this reason, the devices with the 
lowest thresholds had very little margin, if struck by an ion. Future scaling will likely mean 
that the spread in threshold distributions will increase, even as the mean threshold is 
reduced. Smaller devices have fewer dopant atoms, N, and the standard deviation varies as 
N^(1/2) (a relative spread of N^(-1/2)), so there will be more relative variation in threshold voltages initially. Gaillard 169 
performed 2D device modeling to determine the effect of a spot of trapped charge, such as 
that shown in Fig. 25, on the device threshold voltage. He found that the threshold shift was 
largest when the spot of trapped charge is in the middle of the device, and large enough to 
cause device failure in many cases. Oldham et al. included oxide thinning in their analysis, 
and concluded that stuck bits would tend to go away in thinner, future oxides. But this has 
not happened as quickly as one might have predicted from that analysis. The likely reason 
was pointed out by Loquet et al., 170 who presented simulation results suggesting that a single 
ion in the bird’s beak region or the field oxide, could also cause a leakage path that would 
cause a bit to fail. 
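
The numbers in this argument can be put side by side in a short calculation. This is a rough sketch under stated assumptions: the 5 V supply is a hypothetical value (the text gives only the load resistance), the function names are invented for illustration, and dopant counts are treated as Poisson distributed so that the relative spread is 1/sqrt(N).

```python
import math


def load_current_pa(supply_v, load_ohms):
    """Largest current the load resistor can supply to the node, in pA (Ohm's law)."""
    return supply_v / load_ohms * 1e12


def worst_case_margin_mv(nominal_mv, sigma_mv, n_sigma):
    """Gate voltage at the pA leakage level for a cell whose threshold
    sits n_sigma standard deviations below the mean."""
    return nominal_mv - n_sigma * sigma_mv


def relative_dopant_spread(n_dopants):
    """Relative fluctuation sigma/N = 1/sqrt(N), assuming Poisson statistics."""
    return 1.0 / math.sqrt(n_dopants)


if __name__ == "__main__":
    # 5 V is an assumed supply; 1e12 ohms is the load value given in the text.
    print(f"load current limit ~ {load_current_pa(5.0, 1e12):.1f} pA")
    # 100 mV nominal, 10 mV sigma, and 6-sigma outliers are from the text.
    print(f"6-sigma cell margin ~ {worst_case_margin_mv(100.0, 10.0, 6.0):.0f} mV")
    # Dopant counts below are illustrative only.
    for n in (10000, 1000, 100):
        print(f"N = {n:5d}: relative spread ~ {relative_dopant_spread(n):.3f}")
```

Under these assumptions the load can replace only a few pA, and a 6σ-low cell reaches that leakage level with roughly 40 mV of margin instead of 100 mV, which is why a small ion-induced threshold shift can be enough to fail the weakest cells.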

Figure 46. Typical I-V characteristic for an n-channel MOSFET. The voltage 
corresponding to 10^-12 A is about 100 mV, but the standard deviation is about 10 
mV. Variation in Vt means that some cells are much more sensitive to single 
ion total dose damage than others. 

Figure 47. Number of failed bits in nominally identical SRAMs as a function 
of total dose, after Poivey et al. 41 Shape of each curve indicates distribution of 
threshold voltages on a given die. Difference between curves indicates 
variation in mean threshold between different die. 


3.6.2 Micro-damage — track formation 


The second stuck bit mechanism was presented by Swift et al., 166 who observed 
shorted gates in memories with thin gate oxides, exposed to gold ions with LET of about 80. 
The mechanism was thought to be similar to SEGR. Memories have lower applied voltages 
than those commonly used in SEGR experiments, but the data suggested some kind of oxide 
damage mechanism. Representative data is shown in Fig. 48. This data was taken on 4M 
DRAMs, and failure to refresh was taken as the failure criterion. The data indicated as 
“stuck at zero” is consistent with low level leakage from damage to the pass transistor, and 
was attributed to micro-dose damage, as we have just discussed. The “stuck at one” data 
indicated failed devices, regardless of the refresh time. These devices seem to have hard 
shorts of the gate oxide, because they cannot hold a charge for any measurable interval. 
These devices did not recover in a high temperature anneal, indicating the failure was not 
due to total dose. Swift et al. concluded that the damage was similar to a gate rupture in 
these devices. 

Figure 48. Stuck bit results, after Swift et al. 166 Lost zeros are attributed to 
micro-dose damage, because increased leakage current correlates with 
degraded refresh characteristic. Lost ones appear to be hard shorts because 
cells cannot hold charge for any measurable interval. 



3.7 Single Event Transients 

Single event transients (SET) have been treated thoroughly in the Short Course, relatively 
recently, by Buchner and Baze 171 in 2001, in far more detail than is possible here. 
Basically, an SET is a voltage transient on a junction caused by an ion passing through a 
circuit. If the transient occurs in a memory chip, it may cause an upset, which we have 
already discussed. Until relatively recently, SET in logic circuits received relatively little 
attention, because errors were rarely observed until feature sizes were scaled below about 
0.3 µm. SETs either did not propagate to a latch, or they were not captured. Logic gates 
switched slowly, compared to the duration of a transient, so the gates acted as low pass 
filters, filtering out high frequency noise. Experimentally, error rate was found to be 
independent of clock frequency, 172 indicating that logic gates did not contribute 
significantly to observed error rates. For this reason, SEE testing of logic circuits focused 
on the radiation response of the registers, since they determined the radiation response of the 
circuit. 


However, with continued scaling, feature sizes are well below 0.3 µm, and clock 
speeds are now 100s of MHz, and SETs in combinational logic have become a significant 
issue. Single event effects in analog circuits were first reported by Koga et al. in 1993, and 
confirmed in other reports. 173-175 Because the transient duration is more nearly comparable 
to the clock cycle, a transient is much more likely to propagate and to be captured at a 
register because it coincides with a clock edge. 176-179 In highly scaled circuits, there is experimental evidence 
that the error rate increases with clock frequency, as illustrated in Fig. 49. The basic idea is 
illustrated in Fig. 50, where a bit stream, 01010, is sampled, with radiation events occurring 
in periods 3 and 4. The only error detected is in period 3, however, because the event in 
period 4 was outside the sampling window. The closer the sampling window is to the bit 
period, the more errors will be detected. Fig. 51 illustrates the situation at the maximum 
operating frequency, where a single bit period is shown, with the rise and fall times 
included. If the sampling window extends from position 1 to position 3, every bit will be 
detected as an error, which defines an upper limit to the operating frequency. Data showing 
an example of this effect are shown in Fig. 52, where the limiting frequency was determined 
by the test equipment, rather than the circuit under test. 
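
The frequency dependence described here can be illustrated with a simple first-order model. This is a sketch under assumptions that are not taken from the text: a transient is assumed to arrive at a random time within the clock period and to be captured whenever it overlaps the sampling window, so the capture probability is roughly (pulse width + window width) divided by the clock period, capped at 1. The function name and the numerical values are hypothetical.

```python
def capture_probability(pulse_width_s, window_s, clock_hz):
    """First-order chance that a single transient is latched as an error.

    Assumes the transient arrives at a uniformly random time and is captured
    if it overlaps the sampling window around a clock edge. The result grows
    linearly with clock frequency until it saturates at 1.
    """
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    vulnerable_time_s = pulse_width_s + window_s
    return min(1.0, vulnerable_time_s * clock_hz)


if __name__ == "__main__":
    pulse_s, window_s = 200e-12, 50e-12  # hypothetical values, not from the text
    for f_mhz in (10, 100, 1000, 10000):
        p = capture_probability(pulse_s, window_s, f_mhz * 1e6)
        print(f"{f_mhz:6d} MHz: capture probability ~ {p:.4f}")
```

In this model the error rate is proportional to frequency at low clock rates, consistent with the trend in Fig. 49, and saturates once the vulnerable interval fills the whole bit period, which corresponds to the limiting case of Fig. 51.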

SET testing can be done in several ways. Wide beam accelerator exposures can be 
used to screen parts for space applications, provided the operating frequency and other 
variables accurately simulate space operation. However, these tests cannot illuminate the 
circuit mechanisms, because the exact location of the ion strikes cannot be determined. For 
that, one must use a focused ion beam, which makes it time consuming and expensive to 
test a complex circuit. Cf and Am laboratory sources have been used, usually for screening 
before accelerator testing, because they are convenient, but the particles are too short range 
to penetrate to the active region in some cases. Pulsed laser testing has been valuable for 
many things, because it is non-destructive, and the position, intensity, and timing of the 
pulses can all be controlled in the experiment. 

Figure 49. SET test data, after Reed et al., 176 showing the error rate 
proportional to frequency. 



Figure 50. Sampling of 01010 bit stream, with transients in third and fourth 
cycles, after Reed et al. 176 Only the third bit will be detected as an error, 
because the transient in the fourth cycle was too short to be detected. As the 
circuit operating frequency increases, transient duration and sampling interval 
become nearly equal and the error rate increases. 

Figure 51. Data pulse, with rise and fall times shown, after Reed et al. 176 If 
the sampling window extends from position 1 to position 3, every bit will be 
detected as an error, defining a maximum operating frequency. 

Figure 52. SET test data, after Reed et al., 176 where error rate increases 
abruptly at the maximum operating frequency of the test equipment. When 
the test was repeated with faster equipment, results were qualitatively similar, 
but the maximum frequency was higher. 

There are a number of modeling tools that can be, and have been, used to study SET. 
Device-level tools include PISCES, PADRE, ATLAS, and GENESIS. These tools solve the 
Poisson and carrier continuity equations for particular device structures, and are used to 
predict device response. Typically, the transient calculated for a device is then fed into a 
circuit analysis code, which calculates the circuit response. It is also possible to use a mixed 
mode tool, such as DAVINCI, which can model one transistor at the device level, and the 
rest at the circuit level. The difficulty with all of this is that in a complex logic circuit, there 
are many different transistors, and one does not know where or when in the logic flow the 
transient is generated. 

Hardening approaches usually fall into one of two categories — process techniques 
and circuit techniques. Process techniques are designed to reduce the transient by reducing 
the charge collection volume, by using epi layers, or well structures, or SOI substrates. 
Circuit techniques may be described as (1) charge dissipation, (2) filtering, or (3) spatial 
redundancy. Charge dissipation means adding capacitance or current drive, so that the 
critical charge for upsetting the circuit is increased. Of course, these approaches impose a 
power penalty. Filtering means slowing down the circuit so that the transient is faster than 
the circuit operation. The whole point of scaling is to produce faster, better performing 
circuits, so this approach imposes a significant performance penalty. Redundancy means 
building in multiple circuit elements, and voting them, which imposes an area penalty. 
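
The voting step in spatial redundancy can be sketched in a few lines. This is a software illustration of what would be a hardware voter in a real design; the function name is hypothetical, and three redundant copies (triple redundancy) are assumed here, since the text does not specify how many copies are used.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant copies of a word.

    Each output bit takes the value held by at least two of the three
    inputs, so a transient that corrupts any single copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010
    upset = good ^ 0b1000          # one copy has a bit flipped by a transient
    voted = majority_vote(good, good, upset)
    print(f"voted output = {voted:04b}")
```

The voter masks an error in any one copy, but not simultaneous errors in the same bit of two copies, and the tripled logic is the area penalty mentioned above.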

SET is an area where the problem will clearly get worse very rapidly with continued 
scaling. It has recently become a problem because the duration of the transient is close to 
the period defined by the clock frequency. One of the main goals of scaling is to push the 
operating frequency higher and higher, so this will inevitably be more of a problem in the 
future. Hardening approaches all involve giving up the performance benefits of scaling. 


3.8 Hard/Soft Breakdown 


Ion-induced hard breakdown (SEGR) has already been discussed. Ion-induced soft 
breakdown (SBD) has only recently become the subject of active study, with the first papers 
at this conference in 2001. 183, 184 For this reason, soft breakdown is likely to be the subject of 
further study for some time to come. By soft breakdown, we mean a modest increase in gate 
oxide leakage current, which is probably due to a localized damage region, if the SBD is 
ion-induced. (SBD, induced by electrical stress, and by radiation, has already been 
discussed by the previous speaker (Paccagnella).) Massengill 184 concluded that the small 
increase in leakage current from SBD would not, by itself, have a significant impact in most 
applications. However, other, later, studies have indicated that if components are subjected 
to lifetime testing, after SBD, they generally fail early in the test. 185-187 Clearly, then, it 
would be useful to understand the nature of the damage region created by an ion strike. 
There is an extensive body of literature on nuclear tracks in solids, which has been 
developed by a community that has flown a variety of different solid films as cosmic ray 
detectors. Much of this literature was reviewed by Fleischer et al., in a book published in 
1975, so the literature is not new. 188 Basically, when an insulator is exposed to cosmic ray 
bombardment, a disordered region is formed along the ion track, which is detected because 
it has a higher etch rate than the undisturbed material. Etch pits indicate the path of the ions. 
These etch pits only form in insulators, such as glass (SiO2), but not in semiconductors or 
metals. Fleischer et al. discuss no fewer than seven models that had been proposed to 
explain various observations, and find none without difficulties. They conclude that the 
damage comes from interactions with free carriers, rather than direct atomic scattering, even 
though the damage consists mainly of displaced atoms. They also conclude that secondary 
effects, delta rays, can be neglected. To explain how they get displaced atoms from 
coulomb effects, rather than displacement damage, they coined the term ion explosion spike. 
The basic idea is that the concentration of ionized atoms in the track region is so high, that 
coulomb repulsive forces are sufficient to break bonds and move atoms out of their normal 
positions. Only in insulators, because of their low mobilities, do the carriers stay 
concentrated in a high-density region long enough for this process to happen. With higher 
mobility, the carriers would simply diffuse away. The authors dismissed direct 
displacement damage as a contributing mechanism because the length of the track, 
determined from the depth of an etch pit, is normally shorter than the nominal range of the 
ion. Since displacement damage is concentrated at the end of the range, there is usually no 
track formed where displacement damage is greatest. 

The idea that ion-induced soft breakdown is associated with defects along the ion 
path seems to be consistent with experimental observations. For example, Conley 183 
reported soft breakdown (increased leakage current) with no critical minimum fluence, and 
no critical field. That is, some leakage was observed with almost the first ion hit, although it 
scaled with fluence after that. And some increased leakage was observed with no applied 
field. Of course, there was more leakage at higher fields. The experiments reported by 
Conley were performed with 3.0 and 3.2 nm oxides, but results are somewhat different in 
thicker oxides. For example, Sexton et al. 189 performed similar experiments on thicker, 7.0 
nm, oxides, with qualitatively different results. There was no detectable increase in leakage 
with the first ion hit — a significant fluence had to accumulate first. Nor were effects 
observed at zero bias. Typically, 3 V applied and 10^7 ions/cm^2 were the points at which 
increased leakage current was observed. Why the results should change qualitatively in this 
fashion with oxide thickness is not clear. Undoubtedly, further studies should be done, and 
will be done to shed more light on these questions. Perhaps the thicker oxides can be 
viewed macroscopically — an amorphous material, if disordered by an ion, is still an 
amorphous material. If it is thick enough, there may be no detectable electrical effects, even 
if some atoms are in new positions. In a thinner oxide, on the other hand, the oxide is only a 
few atomic planes thick to begin with, so one or two atoms out of position can cause 
measurable change in the electrical properties of the oxide. 

Results so far suggest that even the first incident ion can cause soft breakdown in 
thin enough oxides, and the reliability of the devices may be very poor after that. It seems 
clear that if these results are confirmed in future studies, the use of very thin oxides in space 
electronic systems will be a critical reliability problem. 


2003 IEEE Nuclear and Space Radiation Effects Conference Short Course, Monterey, California - July 21, 2003 


4.0 Software Solutions 


The goal in a system program is not to eliminate SEE, but to get to a manageable 
error rate. The software tools, for handling errors, clearly have a major influence in 
determining what error rate is manageable. To fully cover this topic would be far beyond 
the scope of this short course, but we want to touch on a couple of the most important 
points. 


4.1 Error correction 

The most widely used technique for error correction is called a Hamming code. 190, 191 
If the word length is 2^n bits, n+1 check bits are necessary for single error detection and 
correction. One more bit is necessary for double error detection. For example, in a 64-bit 
word, n=6, and n+2=8 bits are necessary for double error detection, single error correction. 
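
The check-bit count and the correction step can be illustrated with a small sketch. It uses the textbook construction, in which check bits sit at power-of-two positions and the number of check bits r is the smallest value satisfying 2^r >= m + r + 1 for m data bits; this is not necessarily the exact scheme of the cited references, and all function names are hypothetical. Only single-bit errors are handled.

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single error correction)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def hamming_encode(data):
    """Encode a list of 0/1 data bits; returns a list indexed from position 1
    (index 0 is unused). Check bits occupy the power-of-two positions."""
    r = sec_check_bits(len(data))
    n = len(data) + r
    code = [0] * (n + 1)
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):              # not a power of two: data position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:
                parity ^= code[pos]
        code[p] = parity                 # makes each parity group even
    return code


def hamming_correct(code):
    """Return a copy with a single-bit error (if any) corrected.
    The syndrome is the XOR of the positions of all set bits; for a single
    error it equals the position of the flipped bit."""
    fixed = list(code)
    syndrome = 0
    for pos in range(1, len(fixed)):
        if fixed[pos]:
            syndrome ^= pos
    if 0 < syndrome < len(fixed):
        fixed[syndrome] ^= 1
    return fixed


if __name__ == "__main__":
    print("check bits for a 64-bit word (SEC):", sec_check_bits(64))
    word = hamming_encode([1, 0, 1, 1])
    hit = list(word)
    hit[5] ^= 1                          # simulate a single event upset
    print("corrected:", hamming_correct(hit) == word)
```

For a 64-bit word the count is 7 check bits for single error correction, matching n+1 with n=6; the extra overall parity bit for double error detection, which brings the total to 8, is not implemented in this sketch.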


4.2 Built-in Self Test (BIST) 

BIST, or built-in self test, is an idea that has received a certain amount of attention in 
the commercial industry, starting many years ago now. ASICs and logic chips have had 
self-testing features for many years. But the idea has been attractive for memories also. 
The idea is to build in test hardware, and embedded software, so that a chip can test itself, 
and fix itself if failed components are found. As the level of integration increases, chips are 
more difficult to test, and the system impact increases if one of them fails, so the concept is 
very attractive. The problem in memories has been that the area required for the BIST 
hardware raises the chip cost, more than the benefits have justified. 192-195 One early BIST 
approach added about a third to the chip area, which corresponds to a one-third cost 
increase. In a mature memory technology, BIST could not improve the yield by anywhere 
close to one third, so it was not cost effective. For this reason, no company had used BIST on 
any product memory chip, although many companies had experimented with it. Recently, 
however, there has been some progress at reducing the area required to implement BIST. 
One paper claimed the area penalty for their approach was less than one percent, about 0.6 
percent. If this approach works as well as claimed, companies might really implement BIST 
on product memory chips. BIST would be attractive from a radiation point of view, because 
it would be a complete solution for stuck bits, for example. On-chip test hardware would 
identify failed bits, and replace them with backup bits. This ability would be useful for any 
other reliability problem, as well. 
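
The identify-and-replace idea can be sketched in software. This is a toy model, not a description of any actual BIST implementation: the memory class, the simple write-then-read test with all-zeros and all-ones patterns, and the spare-cell remapping table are all assumptions made for illustration.

```python
class FaultyMemory:
    """Toy bit-addressable memory in which some cells are stuck at 0 or 1."""

    def __init__(self, size, stuck):
        self.cells = [0] * size
        self.stuck = dict(stuck)         # address -> stuck value

    def write(self, addr, bit):
        # A stuck cell ignores the written value.
        self.cells[addr] = self.stuck.get(addr, bit)

    def read(self, addr):
        return self.stuck.get(addr, self.cells[addr])


def find_stuck_bits(mem, size):
    """Write all zeros, read back, then all ones, read back.
    Any cell that fails either pattern is reported as stuck."""
    failed = set()
    for pattern in (0, 1):
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            if mem.read(addr) != pattern:
                failed.add(addr)
    return sorted(failed)


def build_remap(failed, spares):
    """Assign one spare cell to each failed address."""
    if len(failed) > len(spares):
        raise RuntimeError("not enough spare cells")
    return dict(zip(failed, spares))


if __name__ == "__main__":
    mem = FaultyMemory(16, {3: 1, 9: 0})   # two hypothetical stuck bits
    failed = find_stuck_bits(mem, 16)
    print("failed addresses:", failed)
    print("remap table:", build_remap(failed, [16, 17]))
```

Both stuck-at-one and stuck-at-zero cells are caught because each cell is tested with both data values, and the scheme works only while spare cells remain available.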


5.0 Conclusion 


This might seem like a good place to make predictions, except that we have shown 
that making predictions is not only difficult, but also dangerous. Even so, a few things seem 
clear. One is that scaling will continue, not forever, but for a reasonable time, yet. Scaling 
is the reason we have SEE, and the ability to control SEE is one of the main factors that 
control how fast scaling can proceed in the future. Of course, SEE is a critical problem for 
military and space systems, but it is also an important commercial problem, and will remain 
so. 